Method for cloning and expression of Rhodothermus obamensis DNA polymerase I large fragment in E. coli

ABSTRACT

The present invention provides a novel thermostable DNA polymerase I obtainable from Rhodothermus obamensis, which possesses 3′-5′ exonuclease activity and has a half-life of about 35 minutes at 94° C. This polymerase also contains a tyrosine residue in the ribose binding region, which improves incorporation of dideoxynucleotides. Also provided are isolated DNA and vectors encoding this polymerase, as well as its large fragment, and methods for producing recombinant enzyme using the same.

BACKGROUND OF INVENTION

[0001] The present invention relates to a novel thermostable DNA polymerase I from Rhodothermus obamensis, which possesses 3′-5′ exonuclease activity and has a preliminarily estimated half-life of 35 minutes at 94° C. It also relates to methods for cloning and producing the large fragment of R. obamensis DNA polymerase I, and to isolated DNA encoding this enzyme and vectors containing the same.

[0002] DNA polymerases are important enzymes involved in chromosome replication and repair. These enzymes have also been employed in DNA diagnostics and analysis. In several of these applications, including PCR, thermocycle sequencing, and isothermal strand displacement amplification, DNA polymerases must maintain enzymatic activity at temperatures from 50° C. to 95° C. One advantageous source for such polymerases is thermophiles. Here we describe a method for purifying, cloning and expressing Rhodothermus obamensis DNA polymerase I large fragment in E. coli.

[0003]E. coli DNA polymerase I and T4 DNA polymerase were cloned, purified and characterized previously (Joyce C. M. and Derbyshire V. Methods in Enzymology, 262:3-13, (1995); Nossal N. G. et al. Methods in Enzymology, 262: 560-569, (1995)). These enzymes have a variety of uses in recombinant DNA technology including DNA labeling by nick translation, second-strand cDNA synthesis in cDNA cloning, and DNA sequencing.

[0004] U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,800,159 disclosed the use of DNA polymerases in a process for amplifying, detecting, and/or cloning nucleic acid sequences. This process, commonly referred to as polymerase chain reaction (PCR), involves the use of a polymerase, primers and nucleotide triphosphates and amplifying existing nucleic acid sequences.

[0005] A number of thermostable DNA polymerases have been isolated and cloned from thermophilic eubacteria. The thermostable Bst DNA polymerase from Bacillus stearothermophilus and the Bca DNA polymerase from Bacillus caldotenax have been cloned and expressed in E. coli (Aliotta J. M. et al. Genetic Analysis: Biomol. Engin, 12:185-195, (1996); Uemori, T. et al. J. Biochem. 113:401-410, (1993)). These two DNA polymerases have been used in strand displacement amplification (Milla, M. A. et al. Biotechniques, 24:392-395, (1998)).

[0006] DNA polymerases have also been cloned from a number of Thermus species such as T. aquaticus (Lawyer, F. C., et al. J. Biol. Chem. 264:6427-6437 (1989)), T. thermophilus (Asakura, K. et al. J. Ferment. Bioeng., 76:265-269, (1993)), and T. filiformis (Jung, S. E. et al. GenBank Accession No. AF030320, (1997)). These characterized Thermus DNA polymerases, which belong to the Family A DNA polymerases, exhibit 5′-3′ exonuclease activity while lacking 3′-5′ proofreading exonuclease activity. For thermocycling sequencing, a Taq DNA polymerase variant called ThermoSequenase (F667Y) has been constructed that efficiently incorporates dideoxy terminators and dye-terminators (Tabor S. and Richardson C. C., Proc. Natl. Acad. Sci. USA, 92:6339-6343, (1995); Vander Horn P. B. et al. Biotechniques, 22:758-765, (1996)). Although the readable DNA sequence from one sequencing reaction has improved from 300 bp to about 600 bp, further technical improvements are needed to achieve 1000 or more bases of reliable sequence for each reaction. Such improvement most likely requires the introduction of new DNA polymerases such as thermostable T7-like DNA polymerases.

[0007] Research was conducted on the isolation and purification of DNA polymerases from Thermus aquaticus (Chien, A. et al. J. Bacteriol. 127:1550-1557, (1976)). The publication of Chien, A. et al. discloses the isolation and purification of a DNA polymerase with a temperature optimum of 80° C. from T. aquaticus YT1 strain. The Chien et al., purification procedure involves a four-step process. These steps include preparation of crude extract, DEAE-Sephadex chromatography, phosphocellulose chromatography and chromatography on DNA cellulose.

[0008] U.S. Pat. No. 4,889,818 discloses a purified thermostable DNA polymerase from T. aquaticus, Taq DNA polymerase, having a molecular weight of about 86,000 to 90,000 daltons, prepared by a process substantially identical to the process of Kaledin except that a phosphocellulose chromatography step is substituted for chromatography on single-strand DNA-cellulose. In addition, European Patent Application 0258017 discloses Taq polymerase as the preferred enzyme for use in the PCR process discussed above. Research has indicated that while the Taq DNA polymerase has a 5′-3′ polymerase-dependent exonuclease function, Taq DNA polymerase does not possess a 3′-5′ proofreading exonuclease function (Lawyer, F. C., et al. J. Biol. Chem. 264:6427-6437 (1989)). As a result, Taq DNA polymerase is prone to base incorporation errors, making its use in certain applications undesirable. For example, attempting to clone an amplified gene is problematic since any one copy of the gene may contain an error due to a random misincorporation event. Depending on where in the replication cycle that error occurs (e.g., in an early replication cycle), the entire DNA amplified could contain the erroneously incorporated base, thus giving rise to a mutated gene product.

[0009] Accordingly, it would be desirable to clone and produce a thermostable DNA polymerase with 3′-5′ proof-reading exonuclease activity that may be used to improve the fidelity of DNA amplification reactions described above. It would also be desirable to clone a thermostable and processive DNA polymerase which efficiently incorporates dye terminators.

SUMMARY OF THE INVENTION

[0010] In accordance with the present invention, there is provided a novel thermostable DNA polymerase I from Rhodothermus obamensis, which possesses 3′-5′ exonuclease activity and has a preliminarily estimated half-life of 35 minutes at 94° C. This thermostable enzyme, obtainable from Rhodothermus obamensis, a thermophile isolated from a shallow marine hydrothermal vent in Tachibana Bay, Japan, has a molecular weight of about 104 kDa and possesses a tyrosine residue in the ribose binding region which increases the incorporation rate of dideoxynucleotides.

[0011] Also provided by the instant invention are methods for cloning and producing the large fragment of R. obamensis DNA polymerase I, as well as isolated DNA encoding this enzyme and vectors containing the same. The Rhodothermus obamensis DNA polymerase I large fragment has a molecular weight of about 71 kDa.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 is the nucleotide sequence (SEQ ID NO:1) and the predicted amino acid sequence (SEQ ID NO:2) of R. obamensis DNA polymerase I.

[0013] FIG. 2 is the nucleotide sequence (SEQ ID NO:3) and the predicted amino acid sequence (SEQ ID NO:4) of R. obamensis DNA polymerase I large fragment.

[0014] FIG. 3 is the SDS-PAGE gel showing the purification steps for recombinant R. obamensis DNA polymerase I large fragment. Lanes 1 and 3, IPTG-induced cell extract after heat treatment; lanes 2 and 4, non-induced cell extract after heat treatment; lanes 5 and 7, protein size marker (7 to 212 kDa); lane 6, partially purified recombinant R. obamensis DNA polymerase I large fragment. Arrow I indicates recombinant R. obamensis DNA polymerase I large fragment; arrow II indicates E. coli GroEL protein.

[0015]FIG. 4 illustrates the thermostability of the recombinant R. obamensis DNA polymerase I large fragment at 94° C. The polymerase assay was carried out at 65° C. for 20 min after incubation of the DNA polymerase at 94° C. for 1 to 40 min.

DETAILED DESCRIPTION OF THE INVENTION

[0016] Rhodothermus obamensis was isolated from a shallow marine hydrothermal vent in Tachibana Bay, Japan. It can grow in the temperature range of 50 to 85° C. with an optimal growth temperature of 80° C. The pH range for growth media is pH 5.5 to 9.0. It can be cultured in a marine broth with an NaCl concentration of 1 to 5%. In a preferred embodiment, the type strain is Rhodothermus obamensis OKD7 (Sako Y. et al. Int. J. Syst. Bacteriol. 46:1099-1104, (1996)).

[0017] Purification of R. obamensis DNA Polymerase I

[0018] The native or recombinant R. obamensis DNA polymerase can be purified by the following procedure:

[0019] Cells are resuspended in a lysis buffer (50 mM Tris-HCl, pH 8, 1 mM EDTA, 5 mM DTT) and lysed by sonication. Pulverized ammonium sulfate is added slowly with gentle stirring to a final concentration of 30% (W/V), and the suspension is allowed to sit at 40° C. overnight. The ammonium sulfate precipitate is collected by centrifugation in a rotor at 12,000 rpm for 30 min. The supernatant is discarded. The pellet is resuspended in a buffer containing 50 mM Tris-HCl, pH 8, 10% glycerol, 1 mM EDTA, 5 mM DTT. The R. obamensis DNA polymerase I may be further purified by chromatography, for example:

[0020]R. obamensis DNA polymerase I may be purified by phosphocellulose chromatography (Whatman cellulose phosphate ion-exchange resin P11). Fractions may be assayed for thermostable DNA polymerase activity and peak fractions may be pooled and dialysed.

[0021]R. obamensis DNA polymerase I may be purified by DEAE chromatography (Whatman ion exchange cellulose DE52 resin). Fractions may then be assayed for thermostable DNA polymerase activity and peak fractions can be pooled and dialysed.

[0022]R. obamensis DNA polymerase I may be purified, as in a preferred embodiment, by DNA binding affinity column chromatography (Heparin sepharose or Heparin TSK). Fractions may be assayed for thermostable DNA polymerase activity, and peak fractions may be pooled and dialysed.

[0023]R. obamensis DNA polymerase I can be purified by Mono Q FPLC. Fractions may be assayed for thermostable DNA polymerase activity. Peak fractions may be pooled and dialysed.

[0024]R. obamensis DNA polymerase I may be further purified by Mono S FPLC. Fractions may then be assayed for thermostable DNA polymerase activity, and peak fractions can be pooled and dialysed in a storage buffer with 50% glycerol.

[0025] Alternatively, recombinant R. obamensis DNA polymerase I may be purified by affinity purification via the use of a fusion protein. For example, fusion of R. obamensis DNA polymerase I to maltose binding protein, chitin binding protein, GST, or His tag. After the fusion protein is purified, the affinity tag may be removed by a protease or by controlled protein splicing/cleavage reaction. (U.S. Pat. Nos. 5,643,758 and 5,834,247.)

[0026] Cloning of R. obamensis DNA Polymerase I

[0027] The method described herein by which the R. obamensis DNA polymerase I gene is cloned and its large fragment is expressed includes the following steps:

[0028] 1. The genomic DNA is purified from R. obamensis cells.

[0029] 2. Conserved regions in DNA polymerase I are found by nucleotide sequence comparison of Pol I type DNA polymerases from Eubacteria and especially thermophilic bacteria. Based on the conserved sequences, one set of degenerate primers is designed and an initial PCR is carried out using the degenerate primers to amplify part of the R. obamensis DNA polymerase I. A 609 bp DNA fragment in the DNA polymerase domain is amplified and sequenced.

[0030] 3. Single-stranded DNA primers are designed based on the initial 609 bp sequence. Inverse PCR is used to amplify upstream and downstream DNA sequences. R. obamensis genomic DNA is digested with restriction enzymes with 4-6 bp recognition sequences, giving rise to template DNA of reasonable size for inverse PCR reactions. The digested DNA is self-ligated at a low DNA concentration. The ligated circular DNA is used as template for inverse PCR using a set of primers that anneal to the left or right end of the initial fragment. The inverse PCR products are purified in low-melting agarose gel and sequenced directly using the primers. The newly derived DNA sequences are compared with sequences in GenBank using the BlastX program. This step is repeated until the start codon is found upstream and the stop codon is found downstream. The entire DNA polymerase gene is found to be 2772 bp long, encoding a protein with a predicted molecular weight of 104.7 kDa.

[0031] 4. The 3′-5′ exonuclease domain is compared with that of E. coli DNA polymerase I. It is found that R. obamensis DNA polymerase I contains three conserved motifs of 3′-5′ exonuclease. The three conserved motifs have the following amino acid sequences: motif I, DTE; motif II, NLKYD; motif III, YACED. It is concluded that R. obamensis DNA polymerase I may contain 3′-5′ exonuclease proofreading activity. In addition, R. obamensis DNA polymerase I contains a Tyr residue (Y761) in the ribose binding region (E. coli O helix homolog). It is known that a Tyr residue at this position increases the incorporation rate for dideoxynucleotides.

[0032] 5. To overexpress the large fragment of R. obamensis DNA polymerase I, 888-bp DNA encoding N-terminus 5′-3′ exonuclease domain is deleted by PCR. The deletion variant lacking 5′-3′ exonuclease region is 1884 bp long, encoding the 628-aa DNA polymerase I large fragment with predicted molecular weight of 71.3 kDa. This R. obamensis DNA polymerase I large fragment is similar to E. coli Klenow fragment, but it contains 28 extra amino acid residues at the N-terminus. The DNA coding for the large fragment is amplified by PCR, digested with NdeI and BamHI and cloned into a T7 expression vector pAII17. One clone #7 is further characterized.

[0033] 6. E. coli cells ER2566 [pAII17-Rob polI large fragment] are cultured to late log phase and induced by addition of IPTG (R. obamensis is abbreviated as Rob). Cell extract is prepared and heated at 65° C. for 30 min. Heat-denatured E. coli proteins are removed by centrifugation and the supernatant is assayed at 65° C. for DNA polymerase activity on activated calf thymus DNA. It is found that the large fragment has thermostable DNA polymerase activity.

[0034] 7. R. obamensis DNA polymerase I large fragment is partially purified by chromatography through a Heparin-Sepharose column. Another protein of 60 kDa is copurified with R. obamensis DNA polymerase I large fragment. To determine if this 60 kDa protein is a protease degradation product, the N-terminus of the 60 kDa protein is sequenced. The first 15 residues are compared with known proteins in a protein database. They have 100% identity to E. coli GroEL protein.

[0035] 8. To determine the half-life of the partially purified large fragment, the protein is heated at 94° C. for 1 to 40 min. Samples are taken and assayed for remaining DNA polymerase activity. It is found that R. obamensis DNA polymerase I large fragment has a half-life of 35 min at 94° C.
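The sizes quoted in steps 3 and 5 above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only. It assumes that each quoted length counts sense codons only (no stop codon), and the mean mass per residue that it derives is a rough consistency figure, not a value stated in this specification.

```python
# Figures quoted in steps 3 and 5 of the cloning method.
FULL_GENE_BP = 2772      # full-length coding sequence
DELETED_BP = 888         # net deletion of the 5'-3' exonuclease region
FULL_MW_KDA = 104.7      # predicted mass, full-length enzyme
FRAGMENT_MW_KDA = 71.3   # predicted mass, large fragment


def codon_count(length_bp: int) -> int:
    """Return the number of codons, insisting on a whole number of triplets."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a multiple of 3")
    return length_bp // 3


fragment_bp = FULL_GENE_BP - DELETED_BP
full_aa = codon_count(FULL_GENE_BP)
fragment_aa = codon_count(fragment_bp)

# Mean mass per residue in daltons (assumed sanity check, not from the text);
# the two constructs should give similar values if the figures are consistent.
full_da_per_aa = FULL_MW_KDA * 1000 / full_aa
fragment_da_per_aa = FRAGMENT_MW_KDA * 1000 / fragment_aa

print(fragment_bp, full_aa, fragment_aa)
print(round(full_da_per_aa, 1), round(fragment_da_per_aa, 1))
```

The subtraction reproduces the 1884 bp and 628 codons stated in step 5, and the two mass-per-residue values agree closely with each other.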

[0036] The following Examples are given to illustrate embodiments of the present invention as it is presently preferred to practice. It will be understood that these Examples are illustrative, and that the invention is not to be considered as restricted thereto as indicated in the appended claims.

[0037] The references cited above and below are herein incorporated by reference.

EXAMPLE I Cloning of R. obamensis DNA Polymerase I Gene

[0038] Rhodothermus obamensis (JCM 9785, Japan Collection of Microorganisms, Wako-shi, Saitama, Japan) was cultured in Bacto marine broth at 70° C. overnight. Cells from one liter of culture were collected by centrifugation. Genomic DNA was prepared from the cell pellet by the standard procedure. A set of degenerate primers was designed based on the conserved amino acid sequence in the DNA polymerase domain. The primers have the following sequences:

5′-TCCGA(C/T)CCCAACCT(G/C)CAGAACATCCC-3′ 138-151 (SEQ ID NO:5)

5′-AGGA(G/C)(G/C)AGCTCGTCGTG(G/C)ACCTG-3′ 138-152 (SEQ ID NO:6)

[0039] (G/C) indicates a degenerate position, G or C; likewise, (C/T) indicates C or T.
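To make the degenerate notation concrete, the following Python sketch expands a primer written with (X/Y) alternatives into every concrete sequence present in the primer pool. The function name expand_degenerate is hypothetical and is introduced only for illustration; the primer strings are those of SEQ ID NO:5 and SEQ ID NO:6.

```python
import itertools
import re


def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with (X/Y) alternatives into all sequences."""
    compact = primer.replace(" ", "").upper()
    # Each token is either a parenthesised set of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])", compact)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


primer_138_151 = "TCCGA(C/T)CCCAACCT(G/C)CAGAACATCCC"
primer_138_152 = "AGGA(G/C)(G/C)AGCTCGTCGTG(G/C)ACCTG"

pool_151 = expand_degenerate(primer_138_151)
pool_152 = expand_degenerate(primer_138_152)
print(len(pool_151), len(pool_152))
```

Primer 138-151 has two two-fold degenerate positions and therefore represents 4 sequences; primer 138-152 has three and represents 8.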

[0040] Primers 138-151 and 138-152 were used to amplify a portion of R. obamensis DNA polymerase I by PCR under the following conditions: 95° C. for 30 sec, 50° C. for 1 min, 72° C. for 1 min, 35 cycles, 2.5 units of Taq plus Vent® DNA polymerase (50:1 ratio). A ~600 bp PCR product was found. The PCR product was gel-purified in low-melting agarose gel and sequenced directly by thermocycling sequencing using primer 138-151, which generated a 609 bp DNA fragment. When this DNA fragment was translated into amino acid sequence and compared to known proteins in GenBank, it was found to have 50% aa sequence identity to E. coli DNA polymerase I (pol I) and 54% aa sequence identity to Taq DNA polymerase.

[0041] Two primers were synthesized based on the known 609 bp DNA sequence. They have the following sequences:

5′-CGCAGGGCGTTTGTGCCGCGG-3′ 202-154 (SEQ ID NO:7)

5′-GTCTCCCGCCCCATCTCGGTG-3′ 202-155 (SEQ ID NO:8)

[0042] R. obamensis genomic DNA was digested individually with the following restriction enzymes: AvaI, BsaAI, BsaHI, BstNI, EagI, HaeII, HhaI, HincII, MspI, NcoI, NspI, SacII, Sau3AI, TaqI, TseI, Tsp45I, BanI, or AluI. After restriction digestion, the DNA was purified by phenol-CHCl₃ extraction and ethanol precipitation. The digested DNA was self-ligated at a low DNA concentration (2 ug/ml). T4 DNA ligase was inactivated by heating at 65° C. for 30 min and the DNA was precipitated and resuspended in TE buffer. The self-ligated genomic DNA was used in inverse PCR to amplify the remaining portion of the DNA polymerase I gene. The following conditions were used in inverse PCR: 95° C. for 30 sec, 55° C. for 30 sec, and 72° C. for 2 min, 30 cycles. Inverse PCR products were found in BsaHI, HaeII, NcoI, and NspI digested and self-ligated DNA templates. The NcoI inverse PCR fragment was the largest, giving rise to about 1950 bp of new DNA sequence (2550 bp−600 bp=˜1950 bp). This fragment was gel-purified in low-melting agarose gel and sequenced directly using primers 202-154 and 202-155. Four new primers were made to finish sequencing the NcoI fragment.
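The logic of the inverse PCR step described above can be illustrated on a toy sequence. The following Python sketch is a simplified model under stated assumptions: the sequence, the primer coordinates, and the function name inverse_pcr_product are all hypothetical, and the enzyme is assumed to cut immediately before its recognition sequence, whereas real enzymes cut at enzyme-specific offsets. It shows only how outward-facing primers inside a known region recover the unknown flanks from a self-ligated restriction fragment.

```python
def inverse_pcr_product(genome: str, site: str, rev_5p: int, fwd_5p: int) -> str:
    """Return the top-strand inverse PCR product from a circularized fragment.

    rev_5p: index of the 5' end of the leftward-pointing primer.
    fwd_5p: index of the 5' end of the rightward-pointing primer (> rev_5p).
    Assumption: the enzyme cuts immediately before each recognition site.
    """
    positions = [i for i in range(len(genome)) if genome.startswith(site, i)]
    upstream = [p for p in positions if p <= rev_5p]
    downstream = [p for p in positions if p > fwd_5p]
    if not upstream or not downstream:
        raise ValueError("no restriction site on one side of the known region")
    start, end = max(upstream), min(downstream)
    fragment = genome[start:end]            # linear restriction fragment
    rev_local, fwd_local = rev_5p - start, fwd_5p - start
    # After self-ligation the fragment is a circle. Extension runs from the
    # forward primer to the fragment end, across the ligation junction, and
    # on from the fragment start to the 5' end of the reverse primer.
    return fragment[fwd_local:] + fragment[: rev_local + 1]


# Toy genome: junk, site, left flank, known region, right flank, site, junk.
toy = "AAAAAA" + "GATC" + "TTTTT" + "CCGGCCGGCC" + "TTTTTTT" + "GATC" + "AAAA"
product = inverse_pcr_product(toy, "GATC", rev_5p=17, fwd_5p=22)
print(product, len(product))
```

In this model the product length equals the fragment length minus the bases lying between the two primer 5′ ends, which parallels the subtraction of the known ~600 bp from the 2550 bp NcoI product in the text.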

[0043] Two new inverse PCR primers were made to amplify the DNA beyond the NcoI site. The two primers have the following sequences:

5′-GCCGGCCGCTTGTCAACTCGA-3′ 205-7 (SEQ ID NO:9)

5′-TGATGAACACGTATTGCGCCC-3′ 205-8 (SEQ ID NO:10)

[0044] R. obamensis genomic DNA was digested with restriction enzymes AvaI, BsaHI, BstNI, SacII, Sau3AI, TaqI, TseI, Tsp45I, BanI, or AluI and self-ligated as described above. The ligated genomic DNA was used in inverse PCR. Inverse PCR conditions were 95° C. for 30 sec, 55° C. for 30 sec, and 72° C. for 2 min, 35 cycles. Inverse PCR products were found in Sau3AI, TaqI, and TseI digested and self-ligated DNA. The inverse PCR products were gel-purified and sequenced, which gave rise to 27 bp of new DNA sequence. A start codon was found in the newly derived sequence.

[0045] To amplify the C-terminus coding region of R. obamensis DNA polymerase I, two inverse PCR primers were made:

5′-GAAGCGGGAAGGCTACCGGGCCAA-3′ 204-7 (SEQ ID NO:11)

5′-AGTCGGTGGTAGATGTGCACCATG-3′ 204-8 (SEQ ID NO:12)

[0046] Inverse PCR conditions were 95° C. for 30 sec, 55° C. for 30 sec, and 72° C. for 2 min, 35 cycles. Inverse PCR products were found in HaeII, NspI, Sau3AI, and Tsp45I digested and self-ligated templates. The inverse PCR products were gel-purified and sequenced, which gave rise to the C-terminus coding region. The entire R. obamensis DNA polymerase gene is 2772 bp long, encoding a protein with a predicted molecular weight of 104.7 kDa (FIG. 1). Unlike Taq DNA polymerase, R. obamensis DNA polymerase I contains three conserved 3′-5′ exonuclease motifs. The three conserved motifs have the following amino acid sequences:

[0047] motif I, DTE

[0048] motif II, NLKYD

[0049] motif III, YACED.

[0050] It is concluded that R. obamensis DNA polymerase I may contain 3′-5′ exonuclease proofreading activity. In addition, R. obamensis DNA polymerase I contains a Tyr residue (Y761) in the ribose binding region (E. coli O helix homolog). It is known that a Tyr residue at this position increases the incorporation rate for dideoxynucleotides. Pol I-like DNA polymerases that have a Tyr residue at the ribose selectivity site include DNA polymerases from phage T7 and T3, yeast mitochondria, Mycobacterium tuberculosis, Mycobacterium leprae, Rhodothermus obamensis, and Rhodothermus sp. ‘ITI518’.
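The motif comparison described above can be illustrated by translating short excerpts of the coding sequence and searching for the motif strings. In the following Python sketch, the two DNA excerpts are copied from the sequence listing of SEQ ID NO:1 (codons 353-368 and 417-432, as numbered there); the helper names translate and locate_motifs are hypothetical, and the standard genetic code is assumed. Motif III is not searched because its region is not included in these excerpts.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, ignoring spaces and any trailing partial codon."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Excerpts keyed by the residue number of their first codon.
EXCERPTS = {
    353: "gaa cgg ctg gcc atc gac acg gag acg act tcg acc gag gcc atg tgg",
    417: "aac ctg aag tac gat ctg gtg gtg ctg gcg cgg cac ggc gtc caa gtc",
}
MOTIFS = {"I": "DTE", "II": "NLKYD"}


def locate_motifs() -> dict[str, int]:
    """Return the residue number at which each motif starts, where found."""
    hits = {}
    for first_residue, dna in EXCERPTS.items():
        protein = translate(dna)
        for name, motif in MOTIFS.items():
            index = protein.find(motif)
            if index >= 0:
                hits[name] = first_residue + index
    return hits


print(locate_motifs())
```

A plain string search such as this only locates exact matches; an actual comparison with E. coli DNA polymerase I would rely on a sequence alignment.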

EXAMPLE II Expression of R. obamensis DNA Polymerase I Large Fragment

[0051] To construct a large fragment of R. obamensis DNA polymerase I, 888-bp DNA encoding the N-terminus 5′-3′ exonuclease domain was deleted. The deletion variant lacking the 5′-3′ exonuclease region is 1884 bp long, encoding the 628-aa DNA polymerase I large fragment with a predicted molecular weight of 71.3 kDa. This R. obamensis DNA polymerase I large fragment is similar to E. coli Klenow fragment, but it contains 28 extra amino acid residues at the N-terminus (FIG. 2). The DNA coding for the large fragment was amplified by PCR under the PCR conditions of 95° C. for 30 sec, 55° C. for 30 sec, and 72° C. for 2 min, 20 cycles, 2 units of Vent® DNA polymerase. The PCR primers have the following sequences:

5′-CTGGCCGGCCATATGAACGGCGAAGCCGCCTTGGATGAG-3′ 204-146 (SEQ ID NO:13) (CATATG = NdeI site)

5′-GTTGGATCCGCTTCAGTGGGCATCCAGCCAGTTGTC-3′ 204-147 (SEQ ID NO:14) (GGATCC = BamHI site)
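The design of primers 204-146 and 204-147 can be examined with a short script. The following Python sketch locates the engineered NdeI and BamHI sites and translates the coding portion of each primer. The helper names are hypothetical, the standard genetic code is assumed, and the reading frame applied to the reverse primer is an assumption chosen because it places a stop codon immediately upstream of the BamHI site.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


FWD_204_146 = "CTGGCCGGCCATATGAACGGCGAAGCCGCCTTGGATGAG"
REV_204_147 = "GTTGGATCCGCTTCAGTGGGCATCCAGCCAGTTGTC"
NDEI, BAMHI = "CATATG", "GGATCC"

# Forward primer: the ATG inside the NdeI site (CAT|ATG) is the start codon.
nde_index = FWD_204_146.find(NDEI)
n_terminal = translate(FWD_204_146[nde_index + 3:])

# Reverse primer: view it on the sense strand, then read up to the BamHI site.
rev_sense = reverse_complement(REV_204_147)
bam_index = rev_sense.find(BAMHI)
c_terminal = translate(rev_sense[:bam_index])  # assumed frame, see lead-in

print(nde_index, n_terminal)
print(bam_index, c_terminal)
```

The translated forward primer begins with Met followed by Asn-Gly-Glu-Ala-Ala-Leu-Asp-Glu, which corresponds to residues 298 to 305 of the full-length sequence in FIG. 1.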

[0052] The amplified PCR product was digested with NdeI and BamHI and inserted into a T7 expression vector pAII17 precut with NdeI and BamHI. The ligated DNA was used to transform competent E. coli ER2566 cells. Eighteen Amp^(R) transformants were screened for insert. Six plasmids contained the correct size insert (#2, #5, #6, #7, #12, and #14). To test DNA polymerase activity in all six isolates, E. coli cells ER2566 [pAII17-Rob-polI-large fragment] were cultured to late log phase and induced by addition of IPTG to 0.5 mM concentration (R. obamensis is abbreviated as Rob). Cell extract was prepared by sonication and centrifugation. The cleared lysate was heated at 65° C. for 30 min. Heat-denatured E. coli proteins were removed by centrifugation and the supernatant was analyzed on an SDS-PAGE gel (FIG. 3, lanes 1-4) and was assayed at 65° C. for DNA polymerase activity on activated calf thymus DNA. The DNA polymerase assay was performed in a total volume of 50 ul at 65° C. The reaction contained 20 ul of cell extract, 5 ul (10 ug) of activated calf thymus DNA, 1 ul of dNTP (5.4 mM), 5 ul of 10× Thermopol buffer, 1 ul of [³H]TTP, and 18 ul of sdH₂O. The components of 1× Thermopol buffer are 10 mM KCl, 20 mM Tris-HCl, pH 8.8, 10 mM (NH₄)₂SO₄, 2 mM MgSO₄, 0.1% Triton X-100. Following incubation at 65° C. for 20-30 min, the entire volume was spotted on to DE81 membrane discs and dried under a heating lamp for 30 min. The membranes were washed 2× in 500 ml of 10% TCA. The acid-insoluble [³H]TMP incorporated DNA was counted in scintillation counting solution. It was found that isolates #2, #5, #7, #12, and #14 have thermostable DNA polymerase activity. #7 and #12 displayed the highest activity. #7 was chosen to be further characterized. Two liters of cells of the #7 clone were induced with IPTG and cell extract was prepared by sonication and centrifugation. The cell extract was heated at 65° C. for 30 min and the denatured E. coli proteins were removed by centrifugation. R. obamensis DNA polymerase I large fragment was purified by chromatography through a Heparin-Sepharose column. R. obamensis DNA polymerase I large fragment was eluted with a 50 mM to 1 M NaCl gradient. Fractions 19 and 20 contained the most DNA polymerase activity. Proteins from fractions 15 to 20 were analyzed on an SDS-PAGE gel. Two major proteins were found, one with the expected size of 71 kDa. Another protein of 60 kDa was copurified with R. obamensis DNA polymerase I large fragment (FIG. 3, lane 6). To determine if this 60 kDa protein was a protease degradation product, the N-terminus of the 60 kDa protein was sequenced. The first 15 residues (AAKDVKFGNDARVKM (SEQ ID NO:15)) were compared with a protein database and showed 100% identity to E. coli GroEL protein. It was concluded that the 60 kDa protein is not a protease degradation product. Since R. obamensis DNA polymerase I large fragment is a foreign protein to E. coli, perhaps it needs more GroEL protein to help it fold correctly.
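The composition of the 50 ul polymerase assay described in this Example can be checked arithmetically. The following Python sketch is illustrative only: the component volumes and stock concentrations are those given above, while the derived final concentrations are computed values that are not stated in the text, and the variable names are hypothetical.

```python
# Component volumes of the polymerase assay, in microliters.
ASSAY_MIX_UL = {
    "cell extract": 20,
    "activated calf thymus DNA (10 ug)": 5,
    "dNTP stock (5.4 mM)": 1,
    "10x Thermopol buffer": 5,
    "[3H]TTP": 1,
    "sdH2O": 18,
}

total_ul = sum(ASSAY_MIX_UL.values())

# Final concentrations after dilution into the total volume (derived values).
dntp_final_mm = 5.4 * ASSAY_MIX_UL["dNTP stock (5.4 mM)"] / total_ul
buffer_final_x = 10 * ASSAY_MIX_UL["10x Thermopol buffer"] / total_ul
dna_final_ug_per_ul = 10 / total_ul

print(f"total volume: {total_ul} ul")
print(f"dNTP: {dntp_final_mm:.3f} mM, buffer: {buffer_final_x:.1f}x, "
      f"template: {dna_final_ug_per_ul:.1f} ug/ul")
```

The listed components sum to the stated 50 ul, and the 10× buffer is diluted to 1×, consistent with the 1× Thermopol composition given above.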

[0053] To increase stability of the T7 expression clone, ER2566[pLysS] was transformed with the plasmid carrying Rob polI large fragment. The final expression strain is ER2566[pAII17-Rob polI large fragment, pLysS], Amp^(R) and Cm^(R).

[0054] A sample of the E. coli containing ER2566[pAII17-Rob polI large fragment, pLysS], (NEB#1186) has been deposited under the terms and conditions of the Budapest Treaty with the American Type Culture Collection on Mar. ______, 1999 and received ATCC Accession No. ______.

[0055] To determine the half-life of the partially purified large fragment, the protein was heated at 94° C. for 1 to 40 min. Samples were taken and assayed for remaining DNA polymerase activity. The DNA polymerase assay was the same as described above except that 5 ul of the heat-treated large fragment was used in the assay. The time of heat treatment was plotted against the percentage of remaining DNA polymerase activity. It was found that R. obamensis DNA polymerase I large fragment has a half-life of 35 min at 94° C. (FIG. 4).
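One way to estimate a half-life from such a time course is a log-linear fit. The following Python sketch assumes first-order (single-exponential) inactivation, which is not stated in this specification, and uses synthetic data points generated for illustration; they are not the measurements of FIG. 4. The function name estimate_half_life is hypothetical.

```python
import math


def estimate_half_life(times_min, fraction_remaining):
    """Estimate half-life by least-squares fit of ln(activity) versus time.

    Assumes first-order decay: activity(t) = A0 * exp(-k * t).
    """
    logs = [math.log(f) for f in fraction_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    slope = sxy / sxx                     # equals -k for first-order decay
    if slope >= 0:
        raise ValueError("activity does not decay over the time course")
    return math.log(2) / -slope


# Synthetic time course (not experimental data): ideal decay, 35 min half-life.
times = [1, 5, 10, 20, 30, 40]
fractions = [0.5 ** (t / 35) for t in times]

print(round(estimate_half_life(times, fractions), 2))
```

With real assay data the points scatter around the fitted line, and reading the 50% point directly from the plotted curve, as done for FIG. 4, is an equally valid approach.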

[0056] During the course of this work, the DNA polymerase I gene was cloned from Rhodothermus sp. ‘ITI518’ and was released in GenBank on Jan. 1, 1999 (Blondal et al., GenBank Accession No. AF028719). Rhodothermus obamensis and Rhodothermus sp. ‘ITI518’ DNA polymerase I share 98% amino acid sequence identity. However, the thermostabilities of the Rhodothermus obamensis and Rhodothermus sp. ‘ITI518’ DNA polymerase I large fragments differ. It was reported that the half-life of Rhodothermus sp. ‘ITI518’ DNA polymerase I large fragment at 90° C. is about 10 min (Blondal, T. et al. International Conference: Thermophile 98, Abstract, page G-P20). R. obamensis DNA polymerase I large fragment is more thermostable, with a half-life of 35 min at 94° C. There are two possible explanations. One possibility is that R. obamensis DNA polymerase I large fragment has a different N-terminus than Rhodothermus sp. ‘ITI518’ DNA polymerase I large fragment (due to a different aa deletion in the 5′-3′ exonuclease region). It is known that N-terminal deletion of the 5′-3′ exonuclease domain can increase the thermostability of DNA polymerases. The second possibility is that R. obamensis DNA polymerase I large fragment fortuitously copurified with E. coli protein GroEL, which is a chaperone for protein folding. The inclusion of GroEL protein in the polymerase assay may increase the thermostability of R. obamensis DNA polymerase I large fragment at 94° C.

EXAMPLE III Expression of R. obamensis DNA Polymerase I and its Large Fragment in any Expression Host

[0057] The R. obamensis DNA polymerase I gene or its deletion derivative can be amplified by PCR using primers. The deletion can be in the 5′-3′ or 3′-5′ exonuclease domains. Alternatively, the active site residues of the 5′-3′ or 3′-5′ exonuclease domains can be mutagenized without affecting the DNA polymerase domain. Restriction sites can be engineered in the PCR primers to aid the cloning of the PCR products into appropriate cloning vectors. PCR conditions can be 90-95° C. for 30 sec, 50-65° C. for 30 sec, and 72° C. for 1-3 min, 20-30 cycles, 1-5 units of Vent® DNA polymerase or any proofreading DNA polymerase. PCR products can be digested with appropriate restriction enzymes. After ligation of PCR products to vectors, the ligated DNA can be used to transform the expression host by transformation or electroporation. Plasmid mini-preparations can be made to screen inserts. Once the correct inserts are found, cells can be induced to produce the desired proteins. Cell extract can be prepared by lysozyme treatment or sonication and centrifugation. The cleared lysate can be heated at 65-85° C. for 30-60 min. Heat-denatured E. coli proteins can be removed by centrifugation and the supernatant can be analyzed on an SDS-PAGE gel. The lysate can be assayed at 65-85° C. for DNA polymerase activity on activated calf thymus DNA or single-stranded DNA with a primer. The DNA polymerase assay can be performed in a total volume of 50-100 ul at 65-85° C. The reaction contains 1-20 ul of cell extract, 5 ul (10 ug) of activated calf thymus DNA, 1 ul of dNTP (5.4 mM), 5 ul of 10× Thermopol buffer or any DNA polymerase buffer, 1 ul of [³H]TTP, and 18 ul of sdH₂O. The components of 1× Thermopol buffer are 10 mM KCl, 20 mM Tris-HCl, pH 8.8, 10 mM (NH₄)₂SO₄, 2 mM MgSO₄, 0.1% Triton X-100. Following incubation at 65-85° C. for 10-30 min, the entire volume can be spotted on to DE81 membrane discs and dried. The membranes can be washed 1-2× in 500 ml of 10% TCA. The acid-insoluble [³H]TMP incorporated DNA can be counted in scintillation counting solution. R. obamensis DNA polymerase I and its large fragments can be purified by chromatography through an affinity column, cation/anion exchange columns, or gel filtration columns.

[0058] To determine the half-life of the partially purified large fragment, the protein can be heated at 94° C. for 1 to 60 min. Samples can be taken and assayed for remaining DNA polymerase activity. The time course can be plotted against the percentage of remaining DNA polymerase activity. Heat shock proteins such as the GroEL chaperone can be added to the polymerase reaction to increase the thermostability of the DNA polymerase.

1 15 1 2775 DNA Rhodothermus obamensis CDS (1)..(2772) 1 atg cag cgc ctg tac ctg atc gat gcc atg gcg ctg gcc tat cgg gcg 48 Met Gln Arg Leu Tyr Leu Ile Asp Ala Met Ala Leu Ala Tyr Arg Ala 1 5 10 15 caa tac gtg ttc atc agc cgg ccg ctt gtc aac tcg aag gga cag aac 96 Gln Tyr Val Phe Ile Ser Arg Pro Leu Val Asn Ser Lys Gly Gln Asn 20 25 30 acc tcg gcc gcc tac ggt ttt acg acc tcc ctt ctg aag ctg atc gaa 144 Thr Ser Ala Ala Tyr Gly Phe Thr Thr Ser Leu Leu Lys Leu Ile Glu 35 40 45 gaa cac ggc atg gac tac atg gcc gtg gtc ttc gac gcc ggc ggg gag 192 Glu His Gly Met Asp Tyr Met Ala Val Val Phe Asp Ala Gly Gly Glu 50 55 60 gag ggc acg ttt cgc gaa gcg atc tat gag gaa tac aag gcg cat cgg 240 Glu Gly Thr Phe Arg Glu Ala Ile Tyr Glu Glu Tyr Lys Ala His Arg 65 70 75 80 gag ccg ccg ccg gaa gat ctg ctg gcc aac ctg ccc tgg atc aag gag 288 Glu Pro Pro Pro Glu Asp Leu Leu Ala Asn Leu Pro Trp Ile Lys Glu 85 90 95 atc gtc cgg gcg ctg gac att ccc gtc atc gag gag ccg ggc gtc gag 336 Ile Val Arg Ala Leu Asp Ile Pro Val Ile Glu Glu Pro Gly Val Glu 100 105 110 gcc gac gac gtg atc gga acg ctg gcc cgt cgg gcc gag gcg cac ggc 384 Ala Asp Asp Val Ile Gly Thr Leu Ala Arg Arg Ala Glu Ala His Gly 115 120 125 atc gac gtg gtg atc gtc tca ccc gac aag gac ttt ctg cag ctg ctg 432 Ile Asp Val Val Ile Val Ser Pro Asp Lys Asp Phe Leu Gln Leu Leu 130 135 140 agc ccg cac gtt tcc atc tac aaa ccg gcg cgg cgc ggc gaa acc ttc 480 Ser Pro His Val Ser Ile Tyr Lys Pro Ala Arg Arg Gly Glu Thr Phe 145 150 155 160 gac ctg atc acc atc gag act ttc cgg gag acc tac ggc ctg gag ccg 528 Asp Leu Ile Thr Ile Glu Thr Phe Arg Glu Thr Tyr Gly Leu Glu Pro 165 170 175 cac cag ttc atc gac gtg ctg gct ctc atg ggc gat ccg agc gac aat 576 His Gln Phe Ile Asp Val Leu Ala Leu Met Gly Asp Pro Ser Asp Asn 180 185 190 gtg ccg ggc gtg ccg ggc atc ggc gaa aag acc gcc gtg cag ctc atc 624 Val Pro Gly Val Pro Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Ile 195 200 205 caa cag tac ggc tcg gtg gaa aac ctg ctg gcc cat gcc gag gag gtg 672 Gln Gln Tyr 
Gly Ser Val Glu Asn Leu Leu Ala His Ala Glu Glu Val 210 215 220 aaa ggg aag cgg gcc cgc gag ggg ctc ctg aac cac cgc gag gaa gcg 720 Lys Gly Lys Arg Ala Arg Glu Gly Leu Leu Asn His Arg Glu Glu Ala 225 230 235 240 ctc ctc tcg aag cgg ctg gtg acg atc cgg acc gat gtg ccg ttg cgc 768 Leu Leu Ser Lys Arg Leu Val Thr Ile Arg Thr Asp Val Pro Leu Arg 245 250 255 att cgc tgg gag gcg ttc cat cgc gcc cgg ccc gat ctg ccg cgc ctg 816 Ile Arg Trp Glu Ala Phe His Arg Ala Arg Pro Asp Leu Pro Arg Leu 260 265 270 ctg cag atc ttt cag gag ctg gaa ttc gac tcg ctg gtg cgg cgc atc 864 Leu Gln Ile Phe Gln Glu Leu Glu Phe Asp Ser Leu Val Arg Arg Ile 275 280 285 cgg gaa ggc gga ctg gcc ggc att gtg aac ggc gaa gcc gcc ttg gat 912 Arg Glu Gly Gly Leu Ala Gly Ile Val Asn Gly Glu Ala Ala Leu Asp 290 295 300 gag gcg ctt gaa gcg gag acc gag ccg gag ttc gat ttc ggg cca tac 960 Glu Ala Leu Glu Ala Glu Thr Glu Pro Glu Phe Asp Phe Gly Pro Tyr 305 310 315 320 gag ccg ctg cag gtg tac gat ccg gaa aag gcg gac tac cgg atc gtc 1008 Glu Pro Leu Gln Val Tyr Asp Pro Glu Lys Ala Asp Tyr Arg Ile Val 325 330 335 cgc aac cgc cag cag ctc gac gaa ctc gtg gcg cat ctg gac gga ttc 1056 Arg Asn Arg Gln Gln Leu Asp Glu Leu Val Ala His Leu Asp Gly Phe 340 345 350 gaa cgg ctg gcc atc gac acg gag acg act tcg acc gag gcc atg tgg 1104 Glu Arg Leu Ala Ile Asp Thr Glu Thr Thr Ser Thr Glu Ala Met Trp 355 360 365 gcc tcg ctg gtg ggc att gcc ttt tcc tgg gag aaa ggc cag ggc tac 1152 Ala Ser Leu Val Gly Ile Ala Phe Ser Trp Glu Lys Gly Gln Gly Tyr 370 375 380 tac gtg ccc acg ccg ctg ccg gac ggc acg ccg acc gag acg gtg ctc 1200 Tyr Val Pro Thr Pro Leu Pro Asp Gly Thr Pro Thr Glu Thr Val Leu 385 390 395 400 gag cga ctg gcg ccg atc ctc cga cgg gcg cag cgc aaa gtc ggt cag 1248 Glu Arg Leu Ala Pro Ile Leu Arg Arg Ala Gln Arg Lys Val Gly Gln 405 410 415 aac ctg aag tac gat ctg gtg gtg ctg gcg cgg cac ggc gtc caa gtc 1296 Asn Leu Lys Tyr Asp Leu Val Val Leu Ala Arg His Gly Val Gln Val 420 425 430 ccg ccc ccg tac ttc gac acg atg gtg gcg cac 
tac ctg att gcg ccc 1344 Pro Pro Pro Tyr Phe Asp Thr Met Val Ala His Tyr Leu Ile Ala Pro 435 440 445 gag gaa ccg cat aac ctg gac gtg ctg gcc cgc cag tac ctt cgc tac 1392 Glu Glu Pro His Asn Leu Asp Val Leu Ala Arg Gln Tyr Leu Arg Tyr 450 455 460 cag atg gtt tcc atc acg gaa ctg atc ggc tcg ggt cgc gac cag aag 1440 Gln Met Val Ser Ile Thr Glu Leu Ile Gly Ser Gly Arg Asp Gln Lys 465 470 475 480 tcc atg cgc gac gtg tcg atc gac gag gtg ggg ccc tat gcc tgt gaa 1488 Ser Met Arg Asp Val Ser Ile Asp Glu Val Gly Pro Tyr Ala Cys Glu 485 490 495 gac acg gac att gcg ctg caa ctg gcc gat gtg ctg gcc gcc gag ttg 1536 Asp Thr Asp Ile Ala Leu Gln Leu Ala Asp Val Leu Ala Ala Glu Leu 500 505 510 gac cga cac gga ctc cgg cat atc gcc gag gag atg gag ttc ccg ctc 1584 Asp Arg His Gly Leu Arg His Ile Ala Glu Glu Met Glu Phe Pro Leu 515 520 525 atc gag gtg ctg gcc gat atg gag cgg acg ggc atc tgc atc gat cgc 1632 Ile Glu Val Leu Ala Asp Met Glu Arg Thr Gly Ile Cys Ile Asp Arg 530 535 540 gcg gtg ctt cgg gaa atc ggt aag caa ctc gaa gcg gag ctt cac gaa 1680 Ala Val Leu Arg Glu Ile Gly Lys Gln Leu Glu Ala Glu Leu His Glu 545 550 555 560 ctg gag gtg aag atc tat gag gtg gcc ggc gtc gaa ttc aac atc ggc 1728 Leu Glu Val Lys Ile Tyr Glu Val Ala Gly Val Glu Phe Asn Ile Gly 565 570 575 tcg ccg cag caa ctg gcg gac gtc ttg ttc aag aag ctc ggg ttg aag 1776 Ser Pro Gln Gln Leu Ala Asp Val Leu Phe Lys Lys Leu Gly Leu Lys 580 585 590 ccg cgg gcg cgc acc agc acc ggc cgg cct tcc acc aaa gag agc gtg 1824 Pro Arg Ala Arg Thr Ser Thr Gly Arg Pro Ser Thr Lys Glu Ser Val 595 600 605 ctg cag gag ctg gcc acg cag cac ccg ctc ccc ggc ctg atc ctg gac 1872 Leu Gln Glu Leu Ala Thr Gln His Pro Leu Pro Gly Leu Ile Leu Asp 610 615 620 tgg cga cac ctg gcc aag ctc aaa agc acc tac gtg gac ggc ctc gag 1920 Trp Arg His Leu Ala Lys Leu Lys Ser Thr Tyr Val Asp Gly Leu Glu 625 630 635 640 ccg ctc atc cat ccg gag acc ggc cgc atc cac acc acg ttc aac cag 1968 Pro Leu Ile His Pro Glu Thr Gly Arg Ile His Thr Thr Phe Asn Gln 645 650 655 
acg gtg acg gct acc ggg cgg ctt tcc tcg agc aac ccg aac ctg cag 2016 Thr Val Thr Ala Thr Gly Arg Leu Ser Ser Ser Asn Pro Asn Leu Gln 660 665 670 aac atc ccg gtt cgc acc gag atg ggg cgg gag atc cgc agg gcg ttt 2064 Asn Ile Pro Val Arg Thr Glu Met Gly Arg Glu Ile Arg Arg Ala Phe 675 680 685 gtg ccg cgg ccg ggc tgg aag ctg ctc tcg gcc gac tac gtc cag atc 2112 Val Pro Arg Pro Gly Trp Lys Leu Leu Ser Ala Asp Tyr Val Gln Ile 690 695 700 gaa ctt cgc att ctg gcc gcg ctg agc ggc gac gag gcg ctt cgc cgg 2160 Glu Leu Arg Ile Leu Ala Ala Leu Ser Gly Asp Glu Ala Leu Arg Arg 705 710 715 720 gcc ttt ctg gag gga cag gac atc cat acg gcc acg gca gcc cgc gtc 2208 Ala Phe Leu Glu Gly Gln Asp Ile His Thr Ala Thr Ala Ala Arg Val 725 730 735 ttc aag gtg ccg ccc gag cag gtg acg ccc gag cag cgc cgc cgc gcc 2256 Phe Lys Val Pro Pro Glu Gln Val Thr Pro Glu Gln Arg Arg Arg Ala 740 745 750 aag atg gtc aac tac ggc att ccc tac ggg att tcg gcc tgg ggg ctg 2304 Lys Met Val Asn Tyr Gly Ile Pro Tyr Gly Ile Ser Ala Trp Gly Leu 755 760 765 gcg cag cgg ctt cgc tgc tcc acg cgc gag gcg cag gag ctt atc gaa 2352 Ala Gln Arg Leu Arg Cys Ser Thr Arg Glu Ala Gln Glu Leu Ile Glu 770 775 780 gaa tat cag cgg gcc ttt ccg ggc gtg acg cgc tac ctg cac cgc gtc 2400 Glu Tyr Gln Arg Ala Phe Pro Gly Val Thr Arg Tyr Leu His Arg Val 785 790 795 800 gtc gaa gag gcc cgc cag aag ggc tac gtc gag acg ctg ctg ggc cgc 2448 Val Glu Glu Ala Arg Gln Lys Gly Tyr Val Glu Thr Leu Leu Gly Arg 805 810 815 cgc cgc tac gta ccg aac atc aac tcc cgc aac cgg gcc gag cgc tcg 2496 Arg Arg Tyr Val Pro Asn Ile Asn Ser Arg Asn Arg Ala Glu Arg Ser 820 825 830 atg gcc gaa cgc atc gcc gtg aac atg ccc atc cag ggc acg cag gcc 2544 Met Ala Glu Arg Ile Ala Val Asn Met Pro Ile Gln Gly Thr Gln Ala 835 840 845 gac atg atc aag ctg gcc atg gtg cac atc tac cac cga ctg aag cgg 2592 Asp Met Ile Lys Leu Ala Met Val His Ile Tyr His Arg Leu Lys Arg 850 855 860 gaa ggc tac cgg gcc aag atg ctg ctc cag gtg cac gac gag ctg gtc 2640 Glu Gly Tyr Arg Ala Lys Met Leu Leu 
Gln Val His Asp Glu Leu Val 865 870 875 880 ttc gag atg ccc ccc gaa gag gtg gag ccc gtg cgc caa ctg gtc gag 2688 Phe Glu Met Pro Pro Glu Glu Val Glu Pro Val Arg Gln Leu Val Glu 885 890 895 cag gag atg aag cag gcc ctg ccg ctg gaa ggt gtg ccc atc gag gtg 2736 Gln Glu Met Lys Gln Ala Leu Pro Leu Glu Gly Val Pro Ile Glu Val 900 905 910 gac atc ggc gtc ggc gac aac tgg ctg gat gcc cac tga 2775 Asp Ile Gly Val Gly Asp Asn Trp Leu Asp Ala His 915 920 2 924 PRT Rhodothermus obamensis 2 Met Gln Arg Leu Tyr Leu Ile Asp Ala Met Ala Leu Ala Tyr Arg Ala 1 5 10 15 Gln Tyr Val Phe Ile Ser Arg Pro Leu Val Asn Ser Lys Gly Gln Asn 20 25 30 Thr Ser Ala Ala Tyr Gly Phe Thr Thr Ser Leu Leu Lys Leu Ile Glu 35 40 45 Glu His Gly Met Asp Tyr Met Ala Val Val Phe Asp Ala Gly Gly Glu 50 55 60 Glu Gly Thr Phe Arg Glu Ala Ile Tyr Glu Glu Tyr Lys Ala His Arg 65 70 75 80 Glu Pro Pro Pro Glu Asp Leu Leu Ala Asn Leu Pro Trp Ile Lys Glu 85 90 95 Ile Val Arg Ala Leu Asp Ile Pro Val Ile Glu Glu Pro Gly Val Glu 100 105 110 Ala Asp Asp Val Ile Gly Thr Leu Ala Arg Arg Ala Glu Ala His Gly 115 120 125 Ile Asp Val Val Ile Val Ser Pro Asp Lys Asp Phe Leu Gln Leu Leu 130 135 140 Ser Pro His Val Ser Ile Tyr Lys Pro Ala Arg Arg Gly Glu Thr Phe 145 150 155 160 Asp Leu Ile Thr Ile Glu Thr Phe Arg Glu Thr Tyr Gly Leu Glu Pro 165 170 175 His Gln Phe Ile Asp Val Leu Ala Leu Met Gly Asp Pro Ser Asp Asn 180 185 190 Val Pro Gly Val Pro Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Ile 195 200 205 Gln Gln Tyr Gly Ser Val Glu Asn Leu Leu Ala His Ala Glu Glu Val 210 215 220 Lys Gly Lys Arg Ala Arg Glu Gly Leu Leu Asn His Arg Glu Glu Ala 225 230 235 240 Leu Leu Ser Lys Arg Leu Val Thr Ile Arg Thr Asp Val Pro Leu Arg 245 250 255 Ile Arg Trp Glu Ala Phe His Arg Ala Arg Pro Asp Leu Pro Arg Leu 260 265 270 Leu Gln Ile Phe Gln Glu Leu Glu Phe Asp Ser Leu Val Arg Arg Ile 275 280 285 Arg Glu Gly Gly Leu Ala Gly Ile Val Asn Gly Glu Ala Ala Leu Asp 290 295 300 Glu Ala Leu Glu Ala Glu Thr Glu Pro Glu Phe Asp Phe Gly Pro Tyr 305 310 315 320 
Glu Pro Leu Gln Val Tyr Asp Pro Glu Lys Ala Asp Tyr Arg Ile Val 325 330 335 Arg Asn Arg Gln Gln Leu Asp Glu Leu Val Ala His Leu Asp Gly Phe 340 345 350 Glu Arg Leu Ala Ile Asp Thr Glu Thr Thr Ser Thr Glu Ala Met Trp 355 360 365 Ala Ser Leu Val Gly Ile Ala Phe Ser Trp Glu Lys Gly Gln Gly Tyr 370 375 380 Tyr Val Pro Thr Pro Leu Pro Asp Gly Thr Pro Thr Glu Thr Val Leu 385 390 395 400 Glu Arg Leu Ala Pro Ile Leu Arg Arg Ala Gln Arg Lys Val Gly Gln 405 410 415 Asn Leu Lys Tyr Asp Leu Val Val Leu Ala Arg His Gly Val Gln Val 420 425 430 Pro Pro Pro Tyr Phe Asp Thr Met Val Ala His Tyr Leu Ile Ala Pro 435 440 445 Glu Glu Pro His Asn Leu Asp Val Leu Ala Arg Gln Tyr Leu Arg Tyr 450 455 460 Gln Met Val Ser Ile Thr Glu Leu Ile Gly Ser Gly Arg Asp Gln Lys 465 470 475 480 Ser Met Arg Asp Val Ser Ile Asp Glu Val Gly Pro Tyr Ala Cys Glu 485 490 495 Asp Thr Asp Ile Ala Leu Gln Leu Ala Asp Val Leu Ala Ala Glu Leu 500 505 510 Asp Arg His Gly Leu Arg His Ile Ala Glu Glu Met Glu Phe Pro Leu 515 520 525 Ile Glu Val Leu Ala Asp Met Glu Arg Thr Gly Ile Cys Ile Asp Arg 530 535 540 Ala Val Leu Arg Glu Ile Gly Lys Gln Leu Glu Ala Glu Leu His Glu 545 550 555 560 Leu Glu Val Lys Ile Tyr Glu Val Ala Gly Val Glu Phe Asn Ile Gly 565 570 575 Ser Pro Gln Gln Leu Ala Asp Val Leu Phe Lys Lys Leu Gly Leu Lys 580 585 590 Pro Arg Ala Arg Thr Ser Thr Gly Arg Pro Ser Thr Lys Glu Ser Val 595 600 605 Leu Gln Glu Leu Ala Thr Gln His Pro Leu Pro Gly Leu Ile Leu Asp 610 615 620 Trp Arg His Leu Ala Lys Leu Lys Ser Thr Tyr Val Asp Gly Leu Glu 625 630 635 640 Pro Leu Ile His Pro Glu Thr Gly Arg Ile His Thr Thr Phe Asn Gln 645 650 655 Thr Val Thr Ala Thr Gly Arg Leu Ser Ser Ser Asn Pro Asn Leu Gln 660 665 670 Asn Ile Pro Val Arg Thr Glu Met Gly Arg Glu Ile Arg Arg Ala Phe 675 680 685 Val Pro Arg Pro Gly Trp Lys Leu Leu Ser Ala Asp Tyr Val Gln Ile 690 695 700 Glu Leu Arg Ile Leu Ala Ala Leu Ser Gly Asp Glu Ala Leu Arg Arg 705 710 715 720 Ala Phe Leu Glu Gly Gln Asp Ile His Thr Ala Thr Ala Ala Arg Val 725 730 735 Phe 
Lys Val Pro Pro Glu Gln Val Thr Pro Glu Gln Arg Arg Arg Ala 740 745 750 Lys Met Val Asn Tyr Gly Ile Pro Tyr Gly Ile Ser Ala Trp Gly Leu 755 760 765 Ala Gln Arg Leu Arg Cys Ser Thr Arg Glu Ala Gln Glu Leu Ile Glu 770 775 780 Glu Tyr Gln Arg Ala Phe Pro Gly Val Thr Arg Tyr Leu His Arg Val 785 790 795 800 Val Glu Glu Ala Arg Gln Lys Gly Tyr Val Glu Thr Leu Leu Gly Arg 805 810 815 Arg Arg Tyr Val Pro Asn Ile Asn Ser Arg Asn Arg Ala Glu Arg Ser 820 825 830 Met Ala Glu Arg Ile Ala Val Asn Met Pro Ile Gln Gly Thr Gln Ala 835 840 845 Asp Met Ile Lys Leu Ala Met Val His Ile Tyr His Arg Leu Lys Arg 850 855 860 Glu Gly Tyr Arg Ala Lys Met Leu Leu Gln Val His Asp Glu Leu Val 865 870 875 880 Phe Glu Met Pro Pro Glu Glu Val Glu Pro Val Arg Gln Leu Val Glu 885 890 895 Gln Glu Met Lys Gln Ala Leu Pro Leu Glu Gly Val Pro Ile Glu Val 900 905 910 Asp Ile Gly Val Gly Asp Asn Trp Leu Asp Ala His 915 920 3 1887 DNA Rhodothermus obamensis CDS (1)..(1884) 3 atg aac ggc gaa gcc gcc ttg gat gag gcg ctt gaa gcg gag acc gag 48 Met Asn Gly Glu Ala Ala Leu Asp Glu Ala Leu Glu Ala Glu Thr Glu 1 5 10 15 ccg gag ttc gat ttc ggg cca tac gag ccg ctg cag gtg tac gat ccg 96 Pro Glu Phe Asp Phe Gly Pro Tyr Glu Pro Leu Gln Val Tyr Asp Pro 20 25 30 gaa aag gcg gac tac cgg atc gtc cgc aac cgc cag cag ctc gac gaa 144 Glu Lys Ala Asp Tyr Arg Ile Val Arg Asn Arg Gln Gln Leu Asp Glu 35 40 45 ctc gtg gcg cat ctg gac gga ttc gaa cgg ctg gcc atc gac acg gag 192 Leu Val Ala His Leu Asp Gly Phe Glu Arg Leu Ala Ile Asp Thr Glu 50 55 60 acg act tcg acc gag gcc atg tgg gcc tcg ctg gtg ggc att gcc ttt 240 Thr Thr Ser Thr Glu Ala Met Trp Ala Ser Leu Val Gly Ile Ala Phe 65 70 75 80 tcc tgg gag aaa ggc cag ggc tac tac gtg ccc acg ccg ctg ccg gac 288 Ser Trp Glu Lys Gly Gln Gly Tyr Tyr Val Pro Thr Pro Leu Pro Asp 85 90 95 ggc acg ccg acc gag acg gtg ctc gag cga ctg gcg ccg atc ctc cga 336 Gly Thr Pro Thr Glu Thr Val Leu Glu Arg Leu Ala Pro Ile Leu Arg 100 105 110 cgg gcg cag cgc aaa gtc ggt cag aac ctg aag tac gat ctg gtg 
gtg 384 Arg Ala Gln Arg Lys Val Gly Gln Asn Leu Lys Tyr Asp Leu Val Val 115 120 125 ctg gcg cgg cac ggc gtc caa gtc ccg ccc ccg tac ttc gac acg atg 432 Leu Ala Arg His Gly Val Gln Val Pro Pro Pro Tyr Phe Asp Thr Met 130 135 140 gtg gcg cac tac ctg att gcg ccc gag gaa ccg cat aac ctg gac gtg 480 Val Ala His Tyr Leu Ile Ala Pro Glu Glu Pro His Asn Leu Asp Val 145 150 155 160 ctg gcc cgc cag tac ctt cgc tac cag atg gtt tcc atc acg gaa ctg 528 Leu Ala Arg Gln Tyr Leu Arg Tyr Gln Met Val Ser Ile Thr Glu Leu 165 170 175 atc ggc tcg ggt cgc gac cag aag tcc atg cgc gac gtg tcg atc gac 576 Ile Gly Ser Gly Arg Asp Gln Lys Ser Met Arg Asp Val Ser Ile Asp 180 185 190 gag gtg ggg ccc tat gcc tgt gaa gac acg gac att gcg ctg caa ctg 624 Glu Val Gly Pro Tyr Ala Cys Glu Asp Thr Asp Ile Ala Leu Gln Leu 195 200 205 gcc gat gtg ctg gcc gcc gag ttg gac cga cac gga ctc cgg cat atc 672 Ala Asp Val Leu Ala Ala Glu Leu Asp Arg His Gly Leu Arg His Ile 210 215 220 gcc gag gag atg gag ttc ccg ctc atc gag gtg ctg gcc gat atg gag 720 Ala Glu Glu Met Glu Phe Pro Leu Ile Glu Val Leu Ala Asp Met Glu 225 230 235 240 cgg acg ggc atc tgc atc gat cgc gcg gtg ctt cgg gaa atc ggt aag 768 Arg Thr Gly Ile Cys Ile Asp Arg Ala Val Leu Arg Glu Ile Gly Lys 245 250 255 caa ctc gaa gcg gag ctt cac gaa ctg gag gtg aag atc tat gag gtg 816 Gln Leu Glu Ala Glu Leu His Glu Leu Glu Val Lys Ile Tyr Glu Val 260 265 270 gcc ggc gtc gaa ttc aac atc ggc tcg ccg cag caa ctg gcg gac gtc 864 Ala Gly Val Glu Phe Asn Ile Gly Ser Pro Gln Gln Leu Ala Asp Val 275 280 285 ttg ttc aag aag ctc ggg ttg aag ccg cgg gcg cgc acc agc acc ggc 912 Leu Phe Lys Lys Leu Gly Leu Lys Pro Arg Ala Arg Thr Ser Thr Gly 290 295 300 cgg cct tcc acc aaa gag agc gtg ctg cag gag ctg gcc acg cag cac 960 Arg Pro Ser Thr Lys Glu Ser Val Leu Gln Glu Leu Ala Thr Gln His 305 310 315 320 ccg ctc ccc ggc ctg atc ctg gac tgg cga cac ctg gcc aag ctc aaa 1008 Pro Leu Pro Gly Leu Ile Leu Asp Trp Arg His Leu Ala Lys Leu Lys 325 330 335 agc acc tac gtg gac ggc ctc 
gag ccg ctc atc cat ccg gag acc ggc 1056 Ser Thr Tyr Val Asp Gly Leu Glu Pro Leu Ile His Pro Glu Thr Gly 340 345 350 cgc atc cac acc acg ttc aac cag acg gtg acg gct acc ggg cgg ctt 1104 Arg Ile His Thr Thr Phe Asn Gln Thr Val Thr Ala Thr Gly Arg Leu 355 360 365 tcc tcg agc aac ccg aac ctg cag aac atc ccg gtt cgc acc gag atg 1152 Ser Ser Ser Asn Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Glu Met 370 375 380 ggg cgg gag atc cgc agg gcg ttt gtg ccg cgg ccg ggc tgg aag ctg 1200 Gly Arg Glu Ile Arg Arg Ala Phe Val Pro Arg Pro Gly Trp Lys Leu 385 390 395 400 ctc tcg gcc gac tac gtc cag atc gaa ctt cgc att ctg gcc gcg ctg 1248 Leu Ser Ala Asp Tyr Val Gln Ile Glu Leu Arg Ile Leu Ala Ala Leu 405 410 415 agc ggc gac gag gcg ctt cgc cgg gcc ttt ctg gag gga cag gac atc 1296 Ser Gly Asp Glu Ala Leu Arg Arg Ala Phe Leu Glu Gly Gln Asp Ile 420 425 430 cat acg gcc acg gca gcc cgc gtc ttc aag gtg ccg ccc gag cag gtg 1344 His Thr Ala Thr Ala Ala Arg Val Phe Lys Val Pro Pro Glu Gln Val 435 440 445 acg ccc gag cag cgc cgc cgc gcc aag atg gtc aac tac ggc att ccc 1392 Thr Pro Glu Gln Arg Arg Arg Ala Lys Met Val Asn Tyr Gly Ile Pro 450 455 460 tac ggg att tcg gcc tgg ggg ctg gcg cag cgg ctt cgc tgc tcc acg 1440 Tyr Gly Ile Ser Ala Trp Gly Leu Ala Gln Arg Leu Arg Cys Ser Thr 465 470 475 480 cgc gag gcg cag gag ctt atc gaa gaa tat cag cgg gcc ttt ccg ggc 1488 Arg Glu Ala Gln Glu Leu Ile Glu Glu Tyr Gln Arg Ala Phe Pro Gly 485 490 495 gtg acg cgc tac ctg cac cgc gtc gtc gaa gag gcc cgc cag aag ggc 1536 Val Thr Arg Tyr Leu His Arg Val Val Glu Glu Ala Arg Gln Lys Gly 500 505 510 tac gtc gag acg ctg ctg ggc cgc cgc cgc tac gta ccg aac atc aac 1584 Tyr Val Glu Thr Leu Leu Gly Arg Arg Arg Tyr Val Pro Asn Ile Asn 515 520 525 tcc cgc aac cgg gcc gag cgc tcg atg gcc gaa cgc atc gcc gtg aac 1632 Ser Arg Asn Arg Ala Glu Arg Ser Met Ala Glu Arg Ile Ala Val Asn 530 535 540 atg ccc atc cag ggc acg cag gcc gac atg atc aag ctg gcc atg gtg 1680 Met Pro Ile Gln Gly Thr Gln Ala Asp Met Ile Lys Leu Ala Met Val 
545 550 555 560 cac atc tac cac cga ctg aag cgg gaa ggc tac cgg gcc aag atg ctg 1728 His Ile Tyr His Arg Leu Lys Arg Glu Gly Tyr Arg Ala Lys Met Leu 565 570 575 ctc cag gtg cac gac gag ctg gtc ttc gag atg ccc ccc gaa gag gtg 1776 Leu Gln Val His Asp Glu Leu Val Phe Glu Met Pro Pro Glu Glu Val 580 585 590 gag ccc gtg cgc caa ctg gtc gag cag gag atg aag cag gcc ctg ccg 1824 Glu Pro Val Arg Gln Leu Val Glu Gln Glu Met Lys Gln Ala Leu Pro 595 600 605 ctg gaa ggt gtg ccc atc gag gtg gac atc ggc gtc ggc gac aac tgg 1872 Leu Glu Gly Val Pro Ile Glu Val Asp Ile Gly Val Gly Asp Asn Trp 610 615 620 ctg gat gcc cac tga 1887 Leu Asp Ala His 625 4 628 PRT Rhodothermus obamensis 4 Met Asn Gly Glu Ala Ala Leu Asp Glu Ala Leu Glu Ala Glu Thr Glu 1 5 10 15 Pro Glu Phe Asp Phe Gly Pro Tyr Glu Pro Leu Gln Val Tyr Asp Pro 20 25 30 Glu Lys Ala Asp Tyr Arg Ile Val Arg Asn Arg Gln Gln Leu Asp Glu 35 40 45 Leu Val Ala His Leu Asp Gly Phe Glu Arg Leu Ala Ile Asp Thr Glu 50 55 60 Thr Thr Ser Thr Glu Ala Met Trp Ala Ser Leu Val Gly Ile Ala Phe 65 70 75 80 Ser Trp Glu Lys Gly Gln Gly Tyr Tyr Val Pro Thr Pro Leu Pro Asp 85 90 95 Gly Thr Pro Thr Glu Thr Val Leu Glu Arg Leu Ala Pro Ile Leu Arg 100 105 110 Arg Ala Gln Arg Lys Val Gly Gln Asn Leu Lys Tyr Asp Leu Val Val 115 120 125 Leu Ala Arg His Gly Val Gln Val Pro Pro Pro Tyr Phe Asp Thr Met 130 135 140 Val Ala His Tyr Leu Ile Ala Pro Glu Glu Pro His Asn Leu Asp Val 145 150 155 160 Leu Ala Arg Gln Tyr Leu Arg Tyr Gln Met Val Ser Ile Thr Glu Leu 165 170 175 Ile Gly Ser Gly Arg Asp Gln Lys Ser Met Arg Asp Val Ser Ile Asp 180 185 190 Glu Val Gly Pro Tyr Ala Cys Glu Asp Thr Asp Ile Ala Leu Gln Leu 195 200 205 Ala Asp Val Leu Ala Ala Glu Leu Asp Arg His Gly Leu Arg His Ile 210 215 220 Ala Glu Glu Met Glu Phe Pro Leu Ile Glu Val Leu Ala Asp Met Glu 225 230 235 240 Arg Thr Gly Ile Cys Ile Asp Arg Ala Val Leu Arg Glu Ile Gly Lys 245 250 255 Gln Leu Glu Ala Glu Leu His Glu Leu Glu Val Lys Ile Tyr Glu Val 260 265 270 Ala Gly Val Glu Phe Asn Ile Gly Ser 
Pro Gln Gln Leu Ala Asp Val 275 280 285 Leu Phe Lys Lys Leu Gly Leu Lys Pro Arg Ala Arg Thr Ser Thr Gly 290 295 300 Arg Pro Ser Thr Lys Glu Ser Val Leu Gln Glu Leu Ala Thr Gln His 305 310 315 320 Pro Leu Pro Gly Leu Ile Leu Asp Trp Arg His Leu Ala Lys Leu Lys 325 330 335 Ser Thr Tyr Val Asp Gly Leu Glu Pro Leu Ile His Pro Glu Thr Gly 340 345 350 Arg Ile His Thr Thr Phe Asn Gln Thr Val Thr Ala Thr Gly Arg Leu 355 360 365 Ser Ser Ser Asn Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Glu Met 370 375 380 Gly Arg Glu Ile Arg Arg Ala Phe Val Pro Arg Pro Gly Trp Lys Leu 385 390 395 400 Leu Ser Ala Asp Tyr Val Gln Ile Glu Leu Arg Ile Leu Ala Ala Leu 405 410 415 Ser Gly Asp Glu Ala Leu Arg Arg Ala Phe Leu Glu Gly Gln Asp Ile 420 425 430 His Thr Ala Thr Ala Ala Arg Val Phe Lys Val Pro Pro Glu Gln Val 435 440 445 Thr Pro Glu Gln Arg Arg Arg Ala Lys Met Val Asn Tyr Gly Ile Pro 450 455 460 Tyr Gly Ile Ser Ala Trp Gly Leu Ala Gln Arg Leu Arg Cys Ser Thr 465 470 475 480 Arg Glu Ala Gln Glu Leu Ile Glu Glu Tyr Gln Arg Ala Phe Pro Gly 485 490 495 Val Thr Arg Tyr Leu His Arg Val Val Glu Glu Ala Arg Gln Lys Gly 500 505 510 Tyr Val Glu Thr Leu Leu Gly Arg Arg Arg Tyr Val Pro Asn Ile Asn 515 520 525 Ser Arg Asn Arg Ala Glu Arg Ser Met Ala Glu Arg Ile Ala Val Asn 530 535 540 Met Pro Ile Gln Gly Thr Gln Ala Asp Met Ile Lys Leu Ala Met Val 545 550 555 560 His Ile Tyr His Arg Leu Lys Arg Glu Gly Tyr Arg Ala Lys Met Leu 565 570 575 Leu Gln Val His Asp Glu Leu Val Phe Glu Met Pro Pro Glu Glu Val 580 585 590 Glu Pro Val Arg Gln Leu Val Glu Gln Glu Met Lys Gln Ala Leu Pro 595 600 605 Leu Glu Gly Val Pro Ile Glu Val Asp Ile Gly Val Gly Asp Asn Trp 610 615 620 Leu Asp Ala His 625 5 26 DNA synthetic 5 tccgayccca acctscagaa catccc 26 6 23 DNA Synthetic 6 aggassagct cgtcgtgsac ctg 23 7 21 DNA synthetic 7 cgcagggcgt ttgtgccgcg g 21 8 21 DNA synthetic 8 gtctcccgcc ccatctcggt g 21 9 21 DNA synthetic 9 gccggccgct tgtcaactcg a 21 10 21 DNA synthetic 10 tgatgaacac gtattgcgcc c 21 11 24 DNA Synthetic 11 
gaagcgggaa ggctaccggg ccaa 24 12 24 DNA Synthetic 12 agtcggtggt agatgtgcac catg 24 13 39 DNA Synthetic 13 ctggccggcc atatgaacgg cgaagccgcc ttggatgag 39 14 36 DNA Synthetic 14 gttggatccg cttcagtggg catccagcca gttgtc 36 15 15 PRT Escherichia coli 15 Ala Ala Lys Asp Val Lys Phe Gly Asn Asp Ala Arg Val Lys Met 1 5 10 15 

What is claimed is:
 1. A substantially pure thermostable DNA polymerase I obtainable from Rhodothermus obamensis (JCM 9785).
 2. The substantially pure thermostable DNA polymerase I of claim 1, wherein said polymerase possesses 3′-5′ exonuclease activity and has a half-life of about 35 minutes at 94° C.
 3. The substantially pure thermostable DNA polymerase I of claim 2, wherein said polymerase has a molecular weight of about 104 kDa.
 4. The substantially pure thermostable DNA polymerase I of claim 2, wherein said polymerase is encoded by a DNA segment comprising the DNA sequence of SEQ ID NO: 1 or functional equivalents thereof.
 5. An isolated DNA segment encoding the thermostable DNA polymerase I of claim 1.
 6. The isolated DNA segment of claim 5, wherein said DNA segment comprises the DNA sequence of SEQ ID NO: 1 or functional equivalents thereof.
 7. An isolated DNA segment encoding the large fragment of Rhodothermus obamensis (JCM 9785) DNA polymerase I.
 8. The isolated DNA segment of claim 7, wherein said DNA segment comprises the DNA sequence of SEQ ID NO: 3 or functional equivalents thereof.
 9. A recombinant vector comprising the isolated DNA of any one of claims 5-8.
 10. The recombinant vector of claim 9, wherein said vector comprises pAII17-Rob polI large fragment, pLysS.
 11. A host cell transformed with the recombinant vector of claim 9 or 10.
 12. The transformed host cell of claim 11, wherein said transformed host cell comprises E. coli ER2566[pAII17-Rob polI large fragment, pLysS] (ATCC No. ______).
 13. A method for producing recombinant R. obamensis DNA polymerase I, said method comprising culturing the transformed host cell of claim 11 under conditions suitable to allow the expression of said recombinant R. obamensis DNA polymerase I and recovering recombinant R. obamensis DNA polymerase I.
 14. A recombinant R. obamensis DNA polymerase I produced by the method of claim 13.
 15. A recombinant R. obamensis DNA polymerase I large fragment, wherein said polymerase I large fragment has a molecular weight of about 71 kDa, possesses 3′-5′ exonuclease activity and has a half-life of about 35 minutes at 94° C.
 16. A DNA polymerase I composition comprising the recombinant R. obamensis DNA polymerase I large fragment of claim 15 and an approximately 60 kDa E. coli GroEL protein, wherein the thermostability of said polymerase I large fragment is increased by the presence of said GroEL protein. 